Conversation
bcherry
left a comment
There was a problem hiding this comment.
lgtm, lmk when it's ready for final review
| enable_filter: Any = None, | ||
| filter_options: dict[str, Any] | None = None, |
There was a problem hiding this comment.
I would be more eager to add strict typings here.
I think we should remove the filter_options and put it inside the KrispFilter constructor.
I was imagining something like:
rtc.AudioStream(krisp=KrispFilter(options)) # options can be automatically determined by our third package that will find the dynamic lib inside our wheel
I'm not sure why we make it a generic interface today, feels easier to explicitly mention Krisp, wdyt?
EDIT: another option is to mention Enhanced Noise Suppression or Background Voice Cancellation instead of Krisp
There was a problem hiding this comment.
Since I'm not familiar with Python, any suggestion like this is super welcome! Let me think about this
| { | ||
| "url": url, | ||
| "token": token, | ||
| } |
There was a problem hiding this comment.
Should we make it strictly typed inside our protobufs definition instead of using json? The auth function could accept those as arguments (no need for a prost dependency)
livekit-rtc/livekit/rtc/room.py
Outdated
| def __init__( | ||
| self, | ||
| loop: Optional[asyncio.AbstractEventLoop] = None, | ||
| filters: Optional[List[Any]] = None, |
There was a problem hiding this comment.
Using the constructor is tricky because users don't construct it on agents, we do it for them. IMO we should only expose this API inside the rtc.AudioStream for now
There was a problem hiding this comment.
Yeah, I think so. But the Krisp filter needs to be initialized with the LIVEKIT_URL and TOKEN parameters to verify the token. So, I've initialized it here.
Instead, do you mean that it would be better to pass all of this information in the AudioStream constructor, as you suggested earlier? Or creating a separate init function to pass these info? Something like:
krisp.init(url, token)
or something like that. I think the latter is better
There was a problem hiding this comment.
no strong opinion, tho I know dz wanted to avoid the auth on the python side (so grabbing the url/token on the rust side)
| capacity: int = 0, | ||
| sample_rate: int = 48000, | ||
| num_channels: int = 1, | ||
| audio_filter: Optional[Tuple[str, dict[str, any]]] = None, |
There was a problem hiding this comment.
Any way to get something that looks more like #350 (comment)
This type isn't self-explanatory
There was a problem hiding this comment.
I intended to cover this in the filter module side, because the option parameter will vary depending on the filter module. It looks something like this:
audio_filter=noise_filter.NC()
Please refer to the noise_filter repo for more details
There was a problem hiding this comment.
I'm not sure why we want to make this generic, if we have more filters in the future we could also add new keyword arguments, I propose to change the argument name to be directly krisp
I also doubt we're going to have more filters? (Noise cancellationEcho cancellation should be exposed directly to Python with raw API). Because we need to use it on external speakers not available on the Rust side (not tied to any rtc.AudioStream ― I wouldn't design for it inside this PR)
I think we should narrow down the API for now
There was a problem hiding this comment.
Exposing the raw API is better in terms of architecture, and I agreed with this in our previous discussion. But it ended up with it's not good with our metric system, wasn't it? So we decided to extend AudioStream for now.
In that case, I would say that it would be better to handle the filter options in noise_filter side and to keep AudioStream generic. This would make updates to noise_filter easier. What do you think?
There was a problem hiding this comment.
Sorry, I meant the echo cancellation in the previous message.
What I mean is why we want to keep the name generic like audio_filter instead of krisp
e.g krisp=KrispFilter(options)
or could also be noise_cancellation=krisp.AudioFilter(options)
There was a problem hiding this comment.
do we really need to add this requirement to add the filters inside the RoomOptions? Can registering be skipped and happen implicitly?
Yes, I think we can do that. I hadn't considered that option, but I will if that's preferable.
Also, regarding the naming of livekit-plugins-krisp, if that's the case, I need to follow the common implementation rules for livekit-plugin, correct? I would really appreciate it if you could give me some pointers on that.
There was a problem hiding this comment.
If it isn't that much work to do, I think it is worth it. This way, people only need to focus on a single spot in their code to enable Krisp.
The plugin would need to be contributed inside the agents repository, we have a livekit-plugins-minimal example here.
There was a problem hiding this comment.
If it isn't that much work to do, I think it is worth it. This way, people only need to focus on a single spot in their code to enable Krisp.
Agreed. Will try to make it.
For the plugin repository, is it possible to implement a plugin in a separate repository? The krisp filter plugin needs to be bundled with non-opensource rust-ffi. so I don't think it's a good idea to put it in the plugin repo, as it would likely require implementing some tricky CIs
There was a problem hiding this comment.
yeah I think the plugin should exist in the krisp filters repo, since it's not the same license & closed sourced. we just have to publish a package that's livekit-plugins-krisp, and that should be enough.
There was a problem hiding this comment.
ok, I did the following changes:
- remove audio_filter parameters from RoomOptions, now filters are automatically registered.
- rename audio_filter parameter of rtc.NativeStream parameter to krisp.
So now we can do:
from livekit.plugins import krisp
# ...
stream = rtc.AudioStream.from_participant(
participant=participant,
track_source=rtc.TrackSource.SOURCE_MICROPHONE,
krisp=krisp.BVC(),
)Or something like that.
| capacity: int = 0, | ||
| sample_rate: int = 48000, | ||
| num_channels: int = 1, | ||
| noise_cancellation: Optional[Tuple[str, dict[str, Any]]] = None, |
There was a problem hiding this comment.
I'm not sure what the resolution in the other comment thread was but I'd suggest at least adding a type alias for this e.g. NoiseCancellationOptions so it's clear
also please add documentation to the docstring below
There was a problem hiding this comment.
Yep, I think doing dataclass would be great
There was a problem hiding this comment.
Added NoiseCancellationOptions as a dataclass and exported it for using it by noise-cancellation module.
Also a brief documentation was added!
| capacity: int = 0, | ||
| sample_rate: int = 48000, | ||
| num_channels: int = 1, | ||
| noise_cancellation: Optional[Tuple[str, dict[str, Any]]] = None, |
There was a problem hiding this comment.
same here, please document the new parameter
| capacity: int = 0, | ||
| sample_rate: int = 48000, | ||
| num_channels: int = 1, | ||
| noise_cancellation: Optional[Tuple[str, dict[str, Any]]] = None, |
There was a problem hiding this comment.
same here, please document the new parameter
| @dataclass | ||
| class NoiseCancellationOptions: | ||
| module_id: str | ||
| options: dict[str, Any] |
There was a problem hiding this comment.
What should the options include? This is the final unclear point.
There was a problem hiding this comment.
The options are eventually encoded to JSON and will be passed to the FFI as Cstr.
The noise-cancellation module should have functions like:
def BVC() -> rtc.NoiseCancellationOptions:And users this this type via the function like:
from livekit.plugins import noise_cancellation
# ...
rtc.AudioStream(
# ...
noise_cancellation=noise_cancellation.BVC(),
)So users should not create this Options by their hand.
livekit-rtc/livekit/rtc/room.py
Outdated
| self._remote_participants[participant.identity] = participant | ||
| return participant | ||
|
|
||
| def _filter_handle(self, plugin) -> int | None: |
There was a problem hiding this comment.
This seems unused? Or maybe it is on the krisp-repo?
There was a problem hiding this comment.
Oh, this is unused now. It was used for associate room and filters, but it was moved to rust-sdk. I'll remove this.
The
rust-sdkimplementation (livekit/rust-sdks#559) must be merged before this pull request.